
Conversation

@slaren (Member) commented Sep 9, 2024

  • Avoid copy in llama_sample_dist
  • Remove lambdas from llama_sampler_chain
  • Reduce overhead in logit bias sampler when there are no biases
  • Include call to llama_sampler_accept in llama_sampler_sample

```diff
 gpt_params params;

-llama_batch batch;
+llama_batch batch = {};
```
@slaren (Member, Author) commented:

This also fixes a crash in the server when loading a model fails and llama_batch_free is called on an uninitialized batch.

@github-actions github-actions bot added android Issues specific to Android examples server labels Sep 9, 2024
@github-actions github-actions bot added the testing Everything test related label Sep 9, 2024
@slaren slaren merged commit 5fb5e24 into master Sep 9, 2024
40 checks passed
@slaren slaren deleted the sl/sampling-re-2 branch September 9, 2024 15:10
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
